{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python数据分析之Numpy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Python有着大量功能强大的第三方库。这些第三方库可以大大地扩充Python的功能，我们在实际使用中往往也离不开这些第三方库。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**NumPy是Python的一种开源的数值计算扩展。这种工具可用来存储和处理大型矩阵**，比Python自身的嵌套列表(nested list structure)结构要高效的多。NumPy(Numeric Python)提供了许多高级的数值编程工具。Numpy的一个重要特性是它的数组计算，是我们做数据分析必不可少的一个包。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "导入python库使用关键字import，后面可以自定义库的简称，但是一般都将Numpy命名为np，pandas命名为pd。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用前一定要先导入Numpy包，导入的方法有以下几种："
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "import numpy\n",
    "import numpy as np #推荐写法\n",
    "from numpy import * #不是很建议这种写法，因为不用加前缀的话有可能会与其他函数名称起冲突，因而报错"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1.Numpy的数组对象及其索引"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数组上的数学操作"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "假设我们想将列表中的每个元素增加1，但列表不支持这样的操作："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = [1,2,3,4]\n",
    "#a+1 #报错"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 3, 4, 5]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[x+1 for x in a]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "b = [2,3,4,5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与另一个数组相加，得到对应元素相加的结果："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 2, 3, 4, 5]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a+b #并不是我们想要的结果"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[3, 5, 7, 9]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[x+y for(x,y) in zip(a,b)]  #都需要利用到列表生成式"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这样的操作比较麻烦，而且在数据量特别大的时候会非常耗时间。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果我们使用Numpy，就会变得特别简单"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3, 4])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.array([1,2,3,4])\n",
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2, 3, 4, 5])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a+1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2, 4, 6, 8])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a*2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3, 5, 7, 9])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b = np.array([2,3,4,5])\n",
    "a + b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 产生数组"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "从列表产生数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 2, 3])"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "l = [0,1,2,3]\n",
    "a = np.array(l)\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "从列表传入："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3, 4])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.array([1,2,3,4])\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "生成全0数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0., 0., 0., 0., 0.])"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.zeros(5) #括号内传个数，默认浮点数"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "生成全1的数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1., 1., 1., 1., 1.])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.ones(5) #括号内传个数，默认浮点数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ True,  True,  True,  True,  True])"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.ones(5,dtype=\"bool\") #可以自己指定类型，np.zeros函数同理"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以使用 fill 方法将数组设为指定值"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3, 4])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.array([1,2,3,4])\n",
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([5, 5, 5, 5])"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.fill(5) #让数组中的每一个元素都等于5\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与列表不同，数组中要求所有元素的 dtype 是一样的，如果传入参数的类型与数组类型不一样，需要按照已有的类型进行转换。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2, 2, 2, 2])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.fill(2.5) #自动进行取整\n",
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([2.5, 2.5, 2.5, 2.5])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = a.astype(\"float\") #强制类型转换\n",
    "a.fill(2.5)\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "还可以使用一些特定的方法生成特殊的数组"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "生成整数序列："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 2, 3, 4, 5, 6, 7, 8, 9])"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.arange(1,10) #左闭右开区间，和range的使用方式同理\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "生成等差数列："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.  ,  1.45,  1.9 ,  2.35,  2.8 ,  3.25,  3.7 ,  4.15,  4.6 ,\n",
       "        5.05,  5.5 ,  5.95,  6.4 ,  6.85,  7.3 ,  7.75,  8.2 ,  8.65,\n",
       "        9.1 ,  9.55, 10.  ])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.linspace(1,10,21) #右边是包括在里面的，从a-b一共c个数的等差数列，其实np.arange好像也可以做...\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "生成随机数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.59885325, 0.10557219, 0.59791129, 0.29115235, 0.32322577,\n",
       "       0.57695178, 0.61243804, 0.06854357, 0.72870981, 0.20689729])"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.random.rand(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.04258178, -1.208267  ,  0.55101228,  1.7009833 ,  0.01138422,\n",
       "        0.93007928, -0.08109838, -0.06858875,  1.14998688, -0.60450321])"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.random.randn(10) #标准正态分布"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([11,  8,  4,  2,  2, 13,  9, 13, 10,  6])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.random.randint(1,20,10) #生成随机整数，从1-20中随机10个"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数组属性"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "查看类型："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.  ,  1.45,  1.9 ,  2.35,  2.8 ,  3.25,  3.7 ,  4.15,  4.6 ,\n",
       "        5.05,  5.5 ,  5.95,  6.4 ,  6.85,  7.3 ,  7.75,  8.2 ,  8.65,\n",
       "        9.1 ,  9.55, 10.  ])"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "numpy.ndarray"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "查看数组中的数据类型："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('float64')"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "查看形状，会返回一个元组，每个元素代表这一维的元素数目："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(21,)"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "或者使用："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(21,)"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.shape(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "要看数组里面元素的个数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "21"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "查看数组的维度："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.ndim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 索引和切片"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "和列表相似，数组也支持索引和切片操作。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "索引第一个元素："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.array([0,1,2,3])\n",
    "a[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "修改第一个元素的值"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([10,  1,  2,  3])"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[0] = 10\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`切片，支持负索引："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([12, 13])"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.array([11,12,13,14,15])\n",
    "a[1:3] #左闭右开，从0开始算"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([12, 13])"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[1:-2] #等价于a[1:3]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([12, 13])"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[-4:3] #仍然等价a[1:3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "省略参数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([14, 15])"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[-2:] #从倒数第2个取到底"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([11, 13, 15])"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[::2] #从头取到尾，间隔2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "假设我们记录一部电影的累计票房："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([21000, 21800, 22240, 23450, 25000])"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ob = np.array([21000,21800,22240,23450,25000])\n",
    "ob"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以这样计算每天的票房："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 800,  440, 1210, 1550])"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ob2 = ob[1:]-ob[:-1]\n",
    "ob2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 多维数组及其属性"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "array还可以用来生成多维数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3],\n",
       "       [10, 11, 12, 13]])"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.array([[0,1,2,3],[10,11,12,13]])\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "事实上我们传入的是一个以列表为元素的列表，最终得到一个二维数组。\n",
    "\n",
    "查看形状："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2, 4)"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "查看总的元素个数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "查看维数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.ndim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 多维数组索引"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于二维数组，可以传入两个数字来索引："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3],\n",
       "       [10, 11, 12, 13]])"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "13"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[1,3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "其中，1是行索引，3是列索引，中间用逗号隔开。事实上，Python会将它们看成一个元组（1,3），然后按照顺序进行对应。\n",
    "\n",
    "可以利用索引给它赋值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3],\n",
       "       [10, 11, 12, -1]])"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[1,3] = -1\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "事实上，我们还可以使用单个索引来索引一整行内容："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([10, 11, 12, -1])"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Python会将这单个元组当成对第一维的索引，然后返回对应的内容。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1, 11])"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[:,1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 多维数组切片 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "多维数组，也支持切片操作："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3,  4,  5],\n",
       "       [10, 11, 12, 13, 14, 15],\n",
       "       [20, 21, 22, 23, 24, 25],\n",
       "       [30, 31, 32, 33, 34, 35],\n",
       "       [40, 41, 42, 43, 44, 45],\n",
       "       [50, 51, 52, 53, 54, 55]])"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.array([[0,1,2,3,4,5],[10,11,12,13,14,15],[20,21,22,23,24,25],[30,31,32,33,34,35],[40,41,42,43,44,45],[50,51,52,53,54,55]])\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "想得到第一行的第4和第5两个元素："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3, 4])"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[0,3:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "得到最后两行的最后两列："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[44, 45],\n",
       "       [54, 55]])"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[4:,4:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "得到第三列："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 2, 12, 22, 32, 42, 52])"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[:,2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "每一维都支持切片的规则，包括负索引，省略\n",
    "\n",
    "[lower:upper:step]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "例如，取出3,5行的奇数列："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[20, 22, 24],\n",
       "       [40, 42, 44]])"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[2::2,::2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 切片是引用 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "切片在内存中使用的是引用机制"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2 3]\n"
     ]
    }
   ],
   "source": [
    "a = np.array([0,1,2,3,4])\n",
    "b = a[2:4]\n",
    "print(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "引用机制意味着，Python并没有为b分配新的空间来存储它的值，而是让b指向了a所分配的内存空间，因此，改变b会改变a的值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  1, 10,  3,  4])"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b[0] = 10\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "而这种现象在列表中并不会出现："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 2, 3, 4, 5]\n"
     ]
    }
   ],
   "source": [
    "a = [1,2,3,4,5]\n",
    "b = a[2:4]\n",
    "b[0] = 10\n",
    "print(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这样做的好处在于，对于很大的数组，不用大量复制多余的值，节约了空间。\n",
    "\n",
    "缺点在于，可能出现改变一个值改变另一个值的情况。\n",
    "\n",
    "一个解决方法是使用copy()方法产生一个复制，这个复制会申请新的内存："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 2, 3, 4])"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.array([0,1,2,3,4])\n",
    "b = a[2:4].copy()\n",
    "b[0] = 10\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 花式索引 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "切片只能支持连续或者等间隔的切片操作，要想实现任意位置的操作。需要使用花式索引 fancy slicing。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 一维花式索引 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与range函数类似，我们可以使用arange函数来产生等差数组。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = np.arange(0,100,10)\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "花式索引需要指定索引位置："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "index = [1,2,-3]\n",
    "y = a[index]\n",
    "print(y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "还可以使用布尔数组来花式索引："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mask = np.array([0,2,2,0,0,1,0,0,1,0],dtype = bool)\n",
    "mask"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "mask必须是布尔数组，长度必须和数组长度相等。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a[mask]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 二维花式索引 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于二维花式索引，我们需要给定行和列的值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = np.array([[0,1,2,3,4,5],[10,11,12,13,14,15],[20,21,22,23,24,25],[30,31,32,33,34,35],[40,41,42,43,44,45],[50,51,52,53,54,55]])\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "返回的是一条次对角线上的5个值。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a[(0,1,2,3,4),(1,2,3,4,5)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "返回的是最后三行的1,3,5列。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a[3:,[0,2,4]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "也可以使用mask进行索引："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mask = np.array([1,0,1,0,0,1],dtype = bool)\n",
    "a[mask,2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与切片不同，花式索引返回的是原对象的一个复制而不是引用。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### “不完全”索引 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "只给定行索引的时候，返回整行："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "y = a[:3]\n",
    "y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这时候也可以使用花式索引取出第2,3,5行："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "con = np.array([0,1,1,0,1,0],dtype = bool)\n",
    "a[con]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### where语句\n",
    "\n",
    "```python\n",
    "where(array)\n",
    "```\n",
    "where函数会返回所有非零元素的索引。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 一维数组\n",
    "先看一维的例子："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = np.array([0,12,5,20])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "判断数组中的元素是不是大于10："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a>10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "数组中所有大于10的元素的索引位置："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.where(a>10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "注意到where的返回值是一个元组。返回的是索引位置，索引[1,3]大于10的数\n",
    "\n",
    "也可以直接用数组操作。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a[a>10]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a[np.where(a>10)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.数组类型 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "具体如下：\n",
    "\n",
    "|**基本类型**|**可用的Numpy类型**|**备注**|\n",
    "|-:|-:|-:|\n",
    "|布尔型|bool|占一个字节|  \n",
    "|整型|int8,int16,int32,int64,int128,int|int跟C语言中long一样大|\n",
    "|无符号整型|uint8,uint16,uint32,uint64,uint128,uint|uint跟C语言中的unsigned long一样大|\n",
    "|浮点数|float16,float32,float|默认为双精度float64，longfloat精度大小与系统有关|\n",
    "|复数|complex64,complex128,complex,longcomplex|默认为complex128,即实部虚部都为双精度|\n",
    "|字符串|string,unicode|可以使用dtype=S4表示一个4字节字符串的数组|\n",
    "|对象|object|数组中可以使用任意值|\n",
    "|时间|datetime64,timedelta64||"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 类型转换"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = np.array([1.5,-3],dtype = float)\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### asarray 函数 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = np.array([1,2,3])\n",
    "np.asarray(a,dtype = float)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### astype方法\n",
    "\n",
    "astype 方法返回一个新数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = np.array([1,2,3])\n",
    "a.astype(float)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a #a本身并没有发生变化--拷贝"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.数组操作 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 我们以豆瓣10部高分电影为例 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "##电影名称\n",
    "mv_name = [\"肖申克的救赎\",\"控方证人\",\"美丽人生\",\"阿甘正传\",\"霸王别姬\",\"泰坦尼克号\",\"辛德勒的名单\",\"这个杀手不太冷\",\"疯狂动物城\",\"海豚湾\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "##评分人数\n",
    "mv_num = np.array([692795,42995,327855,580897,478523,157074,306904,662552,284652,159302])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "##评分\n",
    "mv_score = np.array([9.6,9.5,9.5,9.4,9.4,9.4,9.4,9.3,9.3,9.3])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "##电影时长（分钟）\n",
    "mv_length = np.array([142,116,116,142,171,194,195,133,109,92])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数组排序"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### sort函数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.sort(mv_num)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mv_num #sort不改变原来数组"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### argsort函数 \n",
    "\n",
    "argsort返回从小到大的排列在数组中的索引位置："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "order = np.argsort(mv_num)\n",
    "order"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mv_name[order[0]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mv_name[order[-1]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 求和 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.sum(mv_num)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mv_num.sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 最大值 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.max(mv_length)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mv_length.max()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 最小值 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.min(mv_score)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mv_score.min()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 均值 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.mean(mv_length)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mv_length.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 标准差 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.std(mv_length)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "mv_length.std()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 相关系数矩阵 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.cov(mv_score,mv_length)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.多维数组操作 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数组形状 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = np.arange(6)\n",
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a.shape=(2,3)\n",
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "与之对应的方法是reshape，但它不会修改原来数组的值，而是返回一个新的数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = np.arange(6)\n",
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a.reshape(2,3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a #没变"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 转置 "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = a.reshape(2,3)\n",
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a.transpose() #只要没赋值给本身，a的数值不会变换"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 数组连接 "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "有时候我们需要将不同的数组按照一定的顺序连接起来：  \n",
    "concatenate((a0,a1,...,aN),axis = 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "注意，这些数组要用()包括到一个元组中去。  \n",
    "除了给定的轴外，这些数组其他轴的长度必须是一样的。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "x = np.array([[0,1,2],[10,11,12]])\n",
    "y = np.array([[50,51,52],[60,61,62]])\n",
    "print(x.shape)\n",
    "print(y.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "默认沿着第一维进行连接："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "z = np.concatenate((x,y))\n",
    "z"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "沿着第二维进行连接："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "z = np.concatenate((x,y),axis = 1)\n",
    "z"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "注意到这里x和y的形状是一样的，还可以将它们连接成三维的数组，但是concatenate不能提供这样的功能，不过可以这样："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "z = np.array((x,y))\n",
    "z"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "事实上，Numpy提供了分别对应这三种情况的函数：\n",
    "* vstack\n",
    "* hstack\n",
    "* dstack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.vstack((x,y))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.dstack((x,y))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5.Numpy内置函数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "a = np.array([-1,2,3,-2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.abs(a) #绝对值"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.exp(a) #指数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.median(a) #中值"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "pycharm": {
     "is_executing": true
    }
   },
   "outputs": [],
   "source": [
    "np.cumsum(a) #累积和"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "numpy的内置函数非常多，不需要死记，懂得查资料。\n",
    "\n",
    "https://blog.csdn.net/nihaoxiaocui/article/details/51992860?locationNum=5&fps=1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6.数组属性方法总结 \n",
    "\n",
    "课上只讲了一些常见的，其余感兴趣的同学可以自行学习。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "|**调用方法**|**作用**|\n",
    "|-:|-:|\n",
    "|**1**|**基本属性**|\n",
    "|a.dtype|数组元素类型float32,uint8,...|\n",
    "|a.shape|数组形状(m,n,o,...)|\n",
    "|a.size|数组元素数|\n",
    "|a.itemsize|每个元素占字节数|\n",
    "|a.nbytes|所有元素占的字节|\n",
    "|a.ndim|数组维度|\n",
    "|-|-|\n",
    "|**2**|**形状相关**|\n",
    "|a.flat|所有元素的迭代器|\n",
    "|a.flatten()|返回一个1维数组的复制|\n",
    "|a.ravel()|返回一个一维数组，高效|\n",
    "|a.resize(new_size)|改变形状|\n",
    "|a.swapaxes(axis1,axis2)|交换两个维度的位置|\n",
    "|a.transpose(* axex)|交换所有维度的位置|\n",
    "|a.T|转置，a.transpose()|\n",
    "|a.squeeze()|去除所有长度为1的维度|\n",
    "|-|-|\n",
    "|**3**|**填充复制**|\n",
    "|a.copy()|返回数组的一个复制|\n",
    "|a.fill(value)|将数组的元组设置为特定值|\n",
    "|-|-|\n",
    "|**4**|**转化**|\n",
    "|a.tolist()|将数组转化为列表|\n",
    "|a.tostring()|转换为字符串|\n",
    "|a.astype(dtype)|转换为指定类型|\n",
    "|a.byteswap(False)|转换大小字节序|\n",
    "|a.view(type_or_dtype)|生成一个使用相同内存，但使用不同的表示方法的数组|\n",
    "|-|-|\n",
    "|**5**|**查找排序**|\n",
    "|a.nonzero()|返回所有非零元素的索引|\n",
    "|a.sort(axis=-1)|沿某个轴排序|\n",
    "|a.argsort(axis=-1)|沿某个轴，返回按排序的索引|\n",
    "|a.searchsorted(b)|返回将b中元素插入a后能保持有序的索引值|\n",
    "|-|-|\n",
    "|**6**|**元素数学操作**|\n",
    "|a.clip(low,high)|将数值限制在一定范围内|\n",
    "|a.round(decimals=0)|近似到指定精度|\n",
    "|a.cumsum(axis=None)|累加和|\n",
    "|a.cumprod(axis=None)|累乘积|\n",
    "|-|-|\n",
    "|**7**|**约简操作**|\n",
    "|a.sum(axis=None)|求和|\n",
    "|a.prod(axis=None)|求积|\n",
    "|a.min(axis=None)|最小值|\n",
    "|a.max(axis=None)|最大值|\n",
    "|a.argmin(axis=None)|最小值索引|\n",
    "|a.argmax(axis=None)|最大值索引|\n",
    "|a.ptp(axis=None)|最大值减最小值|\n",
    "|a.mean(axis=None)|平均值|\n",
    "|a.std(axis=None)|标准差|\n",
    "|a.var(axis=None)|方差|\n",
    "|a.any(axis=None)|只要有一个不为0，返回真，逻辑或|\n",
    "|a.all(axis=None)|所有都不为0，返回真，逻辑与|"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}